Improving Map Reduce Performance in Heterogeneous Distributed System using HDFS Environment-A Review
Authors
Abstract
Hadoop is a Java-based programming framework that supports storing and processing big data in a distributed computing environment. It uses HDFS to store data and MapReduce to process it. MapReduce has become an important distributed processing model for large-scale data-intensive applications such as data mining and web indexing, and it is widely used for short jobs that require low response times. The current Hadoop implementation assumes that the computing nodes in a cluster are homogeneous. Unfortunately, neither the homogeneity nor the data-locality assumption holds in virtualized data centers, and Hadoop's scheduler can cause severe performance degradation in heterogeneous environments. We examine the Longest Approximate Time to End (LATE) scheduler, which is highly robust to heterogeneity; LATE can improve Hadoop response times by a factor of 2 in such clusters.
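The core LATE idea in the abstract above can be sketched in a few lines: estimate each task's remaining time from its progress rate, and speculatively re-execute the straggler with the longest estimated time to end. This is a minimal illustration only, not the paper's or Hadoop's implementation; the `Task` fields and the slow-task threshold are assumptions.

```python
# Minimal sketch of the LATE (Longest Approximate Time to End) heuristic.
# Assumption: each task reports a progress score in [0, 1] and elapsed running
# time; these fields and the threshold value are illustrative, not Hadoop's API.
from dataclasses import dataclass

@dataclass
class Task:
    task_id: str
    progress: float   # fraction of work completed, 0.0 .. 1.0
    elapsed: float    # seconds the task has been running

def time_to_end(task: Task) -> float:
    """Estimate remaining time as (1 - progress) / progress_rate,
    where progress_rate = progress / elapsed."""
    if task.progress <= 0.0:
        return float("inf")  # no progress yet: treat remaining time as unbounded
    rate = task.progress / task.elapsed
    return (1.0 - task.progress) / rate

def pick_speculative(tasks, slow_task_threshold=0.25):
    """Pick the task with the longest estimated time to end, considering only
    tasks whose progress rate falls in the slowest fraction of all tasks."""
    rates = sorted(t.progress / t.elapsed for t in tasks if t.progress > 0)
    if not rates:
        return None
    cutoff = rates[max(0, int(len(rates) * slow_task_threshold) - 1)]
    candidates = [t for t in tasks
                  if t.progress > 0 and t.progress / t.elapsed <= cutoff]
    return max(candidates, key=time_to_end, default=None)
```

Restricting speculation to the slowest fraction of tasks is what keeps LATE robust on heterogeneous nodes: tasks that are merely on slower-but-steady machines are not re-executed wholesale.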
Similar resources
Heterogeneous Multi-core Processors for Improving the Efficiency of the Market Basket Analysis Algorithm in Data Mining
Heterogeneous multi-core processors can offer diverse computing capabilities. The efficiency of the market basket analysis algorithm can be improved with heterogeneous multi-core processors. Market basket analysis uses the Apriori algorithm and is one of the popular data mining algorithms that can utilise the Map/Reduce framework to perform analysis. The algorithm generates association rule...
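To illustrate how frequent-itemset counting maps onto the Map/Reduce model mentioned in the snippet above, here is a sketch of one Apriori-style candidate-counting pass. The function names, sample transactions, and minimum-support value are all illustrative assumptions, not code from the cited paper.

```python
# Sketch: one Apriori-style counting pass expressed as map and reduce steps.
# Assumption: transactions are lists of item names; min_support is illustrative.
from collections import Counter
from itertools import combinations

def map_phase(transaction, k):
    """Emit (itemset, 1) for every k-item candidate in one transaction."""
    return [(frozenset(c), 1) for c in combinations(sorted(transaction), k)]

def reduce_phase(pairs, min_support):
    """Sum counts per itemset and keep only those meeting minimum support."""
    counts = Counter()
    for itemset, n in pairs:
        counts[itemset] += n
    return {s: c for s, c in counts.items() if c >= min_support}

transactions = [["milk", "bread"], ["milk", "bread", "eggs"], ["bread", "eggs"]]
pairs = [p for t in transactions for p in map_phase(t, 2)]
frequent = reduce_phase(pairs, min_support=2)
```

In a real Map/Reduce job the framework shuffles the emitted pairs by key between the two phases; the sketch simply concatenates them, which is equivalent for a single reducer.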
Map-merging in Multi-robot Simultaneous Localization and Mapping Process Using Two Heterogeneous Ground Robots
In this article, a fast and reliable map-merging algorithm is proposed to produce a global two-dimensional map of an indoor environment in a multi-robot simultaneous localization and mapping (SLAM) process. In the SLAM process, to find its way in the environment, a robot must be able to determine its position relative to a map formed from its observations. To solve this complex problem, simultan...
Architecture for Hadoop Distributed File Systems
The Hadoop Distributed File System (HDFS) is designed to store very large data sets reliably, and to stream those data sets at high bandwidth to user applications. In a large cluster, thousands of servers both host directly attached storage and execute user application tasks. By distributing storage and computation across many servers, the resource can grow with demand while remaining economica...
Performance Improvement of Map Reduce through Enhancement in Hadoop Block Placement Algorithm
In the last few years, a huge volume of data has been produced from multiple sources across the globe. Dealing with such a volume of data has given rise to the so-called "big data problem," which can be solved only with new computing paradigms and platforms, and which led to the emergence of Apache Hadoop. Inspired by Google's private cluster platform, a few independent software developers develope...
Implementation of image processing system using handover technique with map reduce based on big data in the cloud environment
Cloud computing is one of the emerging techniques for processing big data, and is also known as service on demand. A large set or large volume of data is known as big data. Processing big data (MRI images and DICOM images) normally takes more time. Hard tasks such as handling big data can be solved by using Hadoop. Enhancing the Hadoop concept will help the user t...
Publication date: 2015